Developing Managed User-Defined Aggregates (UDAs)
A
highly specialized feature of SQL Server 2008, managed user-defined
aggregates (UDAs) provide the capability to aggregate column data based
on user-defined criteria built in to .NET code. You can now extend the
(somewhat small) list of aggregate functions usable inside SQL Server to
include those you custom-define.
The implementation contract for a UDA requires the following:
A static method called Init(), used to initialize any data fields in the struct, particularly the field that contains the aggregated value.
A static method called Terminate(), used to return the aggregated value to the UDA’s caller.
A static method called Aggregate(), used to add the value in the current row to the growing value.
A static method called Merge(),
used when SQL Server breaks an aggregation task into multiple threads
of execution (SQL Server actually uses a thread abstraction called a task), each of which needs to merge the value stored in its instance of the UDA with the growing value.
UDAs cannot do any data
access, nor can they have any side-effects—meaning they cannot change
the state of the database. They take only a single input parameter, of
any type. You can also add public methods or properties other than those
required by the contract (such as the IsPrime() method used in the following example).
Like UDTs, UDAs are structs. They are decorated with the SqlUserDefinedAggregate attribute, which has the following parameters for its constructor:
Format—
Tells SQL Server how serialization (and its complement,
deserialization) of the struct should be done. This has the same
possible values and meaning as described earlier for SqlUserDefinedType.
A named parameter list— This list contains the following:
IsInvariantToDuplicates—Tells SQL Server whether the UDA behaves differently with respect to duplicate values passed in from multiple rows.
IsInvariantToNulls—Tells SQL Server whether the UDA behaves differently when null values are passed to it.
IsInvariantToOrder—Tells SQL Server whether the UDA cares about the order in which column values are fed to it.
IsNullIfEmpty—Tells SQL Server that the UDA will return null if its aggregated value is empty (that is, if its value is 0, or the empty string "", and so on).
Name—Tells the deployment routine what to call the UDA when it is created in the database.
MaxByteSize—Tells
SQL Server not to allow more than the specified number of bytes to be
held in an instance of the UDA. You must specify this when using Format.UserDefined.
For this example, you implement a very simple UDA that sums values in an integer column, but only if they are prime. Listing 8 shows the code to do this.
Listing 8. A UDA That Sums Prime Numbers
using System; using System.Data; using System.Data.Sql; using System.Data.SqlTypes; using Microsoft.SqlServer.Server;
[Serializable] [Microsoft.SqlServer.Server.SqlUserDefinedAggregate( Format.Native, IsInvariantToDuplicates=false, IsInvariantToNulls=true, IsInvariantToOrder=true, IsNullIfEmpty=true )] public struct SumPrime { SqlInt64 Sum;
private bool IsPrime(SqlInt64 Number) { for (int i = 2; i < Number; i++) { if (Number % i == 0) { return false; } } return true; }
public void Init() { Sum = 0; }
public void Accumulate(SqlInt64 Value) { if (!Value.IsNull && IsPrime(Value) && Value > 1) Sum += Value; }
public void Merge(SumPrime Prime) { Sum += Prime.Sum; }
public SqlInt64 Terminate() { return Sum; } }
|
In this code, SQL Server first calls Init(), initializing the private Sum data field to 0.
For each column value passed to the aggregate, the Accumulate() method is called, wherein Sum is increased by the value of the column, if it is prime.
When multiple threads converge, Merge() is called, adding the values stored in each instance (as the Prime parameter) to Sum.
When the rowset has been completely parsed, SQL Server calls Terminate(), wherein the accumulated value Sum is returned.
Following are the results of testing SumPrime on Production.Product (an existing AdventureWorks2008 table):
SELECT TOP 10 dbo.SumPrime(p.ProductId) AS PrimeSum, p.Name
FROM Production.Product p
JOIN Production.WorkOrder o ON
o.ProductId = p.ProductId
WHERE Name LIKE '%Frame%'
GROUP BY p.ProductId, p.Name
ORDER BY PrimeSum DESC
go
PrimeSum Name
--------------------------------------------
360355 HL Mountain Frame - Black, 42
338462 HL Mountain Frame - Silver, 42
266030 HL Road Frame - Red, 48
214784 HL Road Frame - Black, 48
133937 HL Touring Frame - Yellow, 46
68338 LL Road Frame - Red, 52
54221 LL Mountain Frame - Silver, 48
15393 ML Road Frame - Red, 52
0 HL Mountain Frame - Black, 38
0 HL Road Frame - Black, 44
(10 row(s) affected.)
Following is the DDL syntax for this UDA:
CREATE AGGREGATE SumPrime(@Number bigint)
RETURNS bigint
EXTERNAL NAME SQLCLR.SumPrime
As with UDTs, with UDAs there is no ALTER AGGREGATE, but you can use DROP AGGREGATE to drop them.